A Data Cleaning Framework for Enabling User Preference Profiling through Wi-Fi Logs
Abstract
Mobile devices have become a ubiquitous medium that supports many kinds of functionality and is widely adopted by the general public. In this study, we investigate using the Wi-Fi logs of a mobile device to discover user preferences. The core idea is twofold. First, every Wi-Fi access point has a network name, normally a human-readable string, called an SSID (Service Set Identifier). Since SSIDs often carry semantic meaning, we can infer from them the place where the user stayed. Second, a Wi-Fi log entry is produced whenever the user is near a Wi-Fi access point, so a high frequency of consecutive observations of the same SSID implies a long stay at a place. To the best of our knowledge, our work is the first attempt to understand users from Wi-Fi logs collected on mobile devices. However, Wi-Fi logs contain various types of information and are noisy. Assessing the information types, eliminating irrelevant information, and cleaning up the noise within partially informative SSIDs are therefore key to profiling user preferences from Wi-Fi logs. In this paper, we propose a data cleaning and information enrichment framework that cleans, corrects, and refines Wi-Fi logs so that user preferences can be understood from them. In addition, we conduct a comprehensive experiment with data collected from users to verify the effectiveness of the proposed techniques for cleaning noisy Wi-Fi data for user preference profiling. We believe that this work opens a new direction for understanding users from a different perspective, and we make available the code and the collected data set used in this study to encourage further research in this direction.
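The second idea in the abstract, that consecutive observations of the same SSID indicate a stay at one place, can be illustrated with a minimal Python sketch. This is not the authors' framework: the log format (a time-ordered list of `(timestamp, ssid)` pairs), the function name `stay_durations`, and the sample SSIDs are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby


def stay_durations(log):
    """Collapse runs of consecutive identical SSIDs into stays.

    `log` is assumed to be a time-ordered list of (timestamp, ssid) pairs.
    Returns a list of (ssid, duration, observation_count) tuples.
    """
    stays = []
    # groupby only merges adjacent records, which matches "consecutively observed".
    for ssid, run in groupby(log, key=lambda rec: rec[1]):
        run = list(run)
        start, end = run[0][0], run[-1][0]
        stays.append((ssid, end - start, len(run)))
    return stays


# Hypothetical log entries, invented for this example.
t0 = datetime(2015, 1, 1, 9, 0)
log = [
    (t0, "CoffeeShop_WiFi"),
    (t0 + timedelta(minutes=5), "CoffeeShop_WiFi"),
    (t0 + timedelta(minutes=10), "CoffeeShop_WiFi"),
    (t0 + timedelta(minutes=30), "CityLibrary"),
    (t0 + timedelta(minutes=35), "CityLibrary"),
]

for ssid, duration, count in stay_durations(log):
    print(ssid, duration, count)
```

A real log would first need the cleaning steps the abstract describes, since noisy or uninformative SSIDs would otherwise break up or pollute the runs.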
Similar resources
I Data Mining Techniques and Analysis of Concept Based User Profiles from Search Engine Logs
Search engine logs are emerging new type of data user profiling component of any personalization interesting opportunities for data mining. Early user profiling work on mining data mostly attempted to discover knowledge at the level of queries based on objects that users are interested in positive preferences but not the objects in negative preferences. In our paper we focus on search engine lo...
Technical Report for "Incentivizing Wi-Fi Network Crowdsourcing: A Contract Theoretic Approach"
Crowdsourced wireless community network enables individual users to share their private Wi-Fi access points (APs) with each other, hence can achieve a large Wi-Fi coverage with a small deployment cost via crowdsourcing. This paper presents a novel contract-based incentive framework to incentivize such a Wi-Fi network crowdsourcing under incomplete information (where each user has certain privat...
Synergic Effect of TC99-m Gamma Radiation and Non-ionizing Radiation of Wi-Fi on Count, Morphology and Motility of Sperms in Rats: An Experimental Study
Background: Given the effects of ionizing radiation on biological tissues and their irreversible tissue damage, this project aimed to determine the synergic effect of TC99-m gamma radiation and non-ionizing radiation of Wi-Fi on sperm characteristics in rats. Materials and Methods: Sixty adult male rats, weighing 250-200 g randomly divided into four groups (three experimental groups and one co...
Precise Indoor Localization Platform Based on WiFi- GeoMagnetic Fingerprinting and Aided IMU
The proposed system exploits Wi-Fi and GeoMagnetic fingerprinting aided by an Inertial Measurement Unit (IMU). Basically, the system consists only of off-the-shelf Wi-Fi Access Points (APs) and a low-cost IMU, while no extra equipment is required. Since the accuracy of Wi-Fi fingerprinting heavily depends on the number of APs deployments, spatial differentiability, and the fluctuation of Receive...
An Efficient Algorithm for Data Cleaning of Web Logs with Spider Navigation Removal
The World Wide Web is growing massively larger with the exponential growth of websites providing the user with heaps of information. Text files called as web logs are used to store the clicks of a user whenever a user visits a website. Web usage mining is a stream of web mining that involves the applications of mining techniques to be applied on the server logs containing the user clickstreams....
Publication date: 2015